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Abstract 

We process snapshots of trajectories of evolution equations with intrinsic symmetries, and 
demonstrate the use of recently developed eigenvector-based techniques to successfully 
quotient out the degrees of freedom associated with the symmetries in the presence of noise. 
Our illustrative examples include a one- dimensional evolutionary partial differential (the 
Kuramoto-Sivashinsky) equation with periodic boundary conditions, as well as a stochastic 
simulation of nematic liquid crystals which can be effectively modeled through a nonlinear 
Smoluchowski equation on the surface of a sphere. This is a useful first step towards 
data mining the "symmetry-adjusted" ensemble of snapshots in search of an accurate 
low- dimensional parametrization (and the associated reduction of the original dynamical 
system). We also demonstrate a technique ("vector diffusion maps") that combines, in a 
single formulation, the symmetry removal step and the dimensionality reduction step. 

Keywords: Dimensionality reduction, heat kernel, local principal component analysis, 
alignment 



1. Introduction 

High-dimensional dynamical systems are often characterized by low- dimensional long- 
term dynamic behavior. Obtaining reduced- dimensionality models consistent with this 
behavior is clearly very useful in both analysis and in computations. While such model 
reduction can be based on properties of the dynamics ( e. g. Center-Manifo ld or Lyapunov- 
Schmidt reduction, see iGuckenheimer fc Holmes! ( 2002 ); Neumaier ( 2001), or Ine r tial an d 
Approximate Ine r tial Manifolds, see Ijollvl (ll989l) : IJollv et all (ll990h : IFoias et all (Il988af ): 
Constantin et all ( ll988f ) ; IFoias et all f)l988bh : iTitJ (ll990f ): lFoias et all (Il989l ) ) , semi-empirical 
methods based on data -mining are also enjoy i ng; ext e nsive use in app l icatio n s (e.g. PCA/PO D- 
Galerkin methods, see iKunisch fc Volkweinl d2008h : berkooz et all ( jl~99~3h : iBerkooz fc Titil 
( 1993 ); Sirisup et al. ( 20051 ) ). As nonlinear extensions of Principal Component Analysis 
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are developed (e.g. technique s like Isomap, Local Li n ear Embedding , Lapl a cian Eigen- 



maps /Diffusion Maps , etc., s eelTenenbaum et al.l (120001) ; iRoweis fc Saull (120001 ) ; iBelkin fc Niyogi 
(120031 ): iNadler et all (l2006h : ICoifman k Lafonl (12006^ . the necessity of linking these 



non- 



linear data reduction techniques with dynamic model reduction naturally arises. 

When the data set of interest consists of snapshots of trajectories of dynamical sys- 
tems with symmetry, "factoring out" this symmetry is an established first step (in theory, 
in computations, as well as in PCA-based data mining); the use of so-calle d "template 
functi o ns" in this context has bee n describ ed by Rowley and c o workers (e.g. [Ahuja et al 



(l2007h : iRowlev fc Marsdenl d2000h . see also lAubrv eTall (Il993h : iHolmes et all ( 1l998f )l In 
this paper we explore the application of recently developed computational approaches to 
symmetry removal ("factoring out" symmetry, "alignment" of the data) for (noisy) high- 
dimensional dynamical system data. Our illustrative examples include (1) the discretiza- 
tion of a well-known spatiotemporal pattern- forming partial differential equation (PDE), 
the Kuramoto-Sivashinsky Equation (KSE) in one spatial dimensional and with periodic 
boundary conditions (with associated symmetry group 5*0(2)); and (2) a stochastic simu- 
lation of a nonlinear 2D Smoluchowski equation, where the evolution of the orientational 
distribution function of an ensemble of nematic liquid crystals is modeled on the sphere 
(with associated symmetry group 50(3)). In both cases noise is present in the data; in 
the KSE case the noise is added externally (by us); in the nematic liquid crystal case the 
noise comes from the stochastic simulation of a finite ensemble of representative particles. 

The essential step in factoring out the relevant symmetries involves relating each snap- 
shot in the data to each other snapshot (in effect, using each snapshot as the "align- 
ment template" for every other snapshot); using these pairwise relations to perform a 
global alignment can be formulated as an optimization probl em that is fru i tfully relaxed 



to an eigenproblem ( hence the term "eigenvector method," see ISingerl (120111 ); ISinger et al 



Singer fc Shkolniskyl ) 



In one of our examples (the KSE) we will also demonstrate the combination of this 
"alignment" with a second, data mining (di mensionality re duction) step; the combination 
carries the name of "vector diffusion maps" (ISinger &: Wul ) and has potential advantages 
over the "two step" approach (first alignment and then reduction). The data set corre- 
sponding to the snapshots of the dynamical system is usually modeled as lying on a low 
dimensional manifold Ai. In the presence of a symmetry group G (such as 50(2) or 
50(3)), vector diffusion maps provide a natural framework to organize the data in the 
quotient space Ai/G. The affinities between data points are related to their correlation 
when they are optimally aligned, and the information about the optimal alignment trans- 
formation (the group element) is also encoded in this framework. The advantage of working 
in the quotient space Ai/G stems from its lower dimensionality compared to the original 
manifold Ai, giving rise to improved dimensionality reduction, noise robustness, and the 
need for less data. 

The paper is organized as follows. In Section [2j we give an overview of the "alignment" 
problem and briefly review template-based methods. Next, Section [3] summarizes the 
eigenvector method and some of its relevant mathematical properties. Sections H] and [5] 
are devoted to applying and comparing template-based approaches and the eigenvector 



2 



method to our two prototypical examples. Finally, in Sections |6] and U\ we demonstrate 
the use of two dimensionality reduction techniques, diffusion maps and vector diffusion 
maps, on the modulated traveling wave data of Section 



2. Description of the problem 

For physical systems possessing symmetry, there may be several equivalent realizations 
of what is effectively the same system state (whether a steady/stationary state or an 
"instance" or "snapshot" during a transient simulation); these realizations are related by 
some underlying symmetry group. When such systems with symmetry evolve in time, 
their dynamics are equivariant with respect to the appropriate symmetry group. Consider 
a function u(9, t) on the unit circle evolving according to some spatially invariant differential 
operator T> via an equation of the form 



u t = V{u). 

This equation is equivariant in the sense that 

V{S c [u\) = S c [V(u)}, 



(1) 



(2) 



where S c [v}(9) = v{9 + c) is the shift operator on spatially periodic functions; starting at a 
particular snapshot, evolving the dynamics for some time and shifting the final state by c is 
the same as the result of shifting the initial snapshot by c and then evolving the dynamics 
from the shifted initial condition (in other words, the differential operator T> commutes 
with the shift operator S c ). 

Suppose we take M snapshots of u at M different times, {u(9,tk)}^ =1 . If u(9,t) is 
not changing its shape, but simply traveling around the unit circle (for example, when 
T>(u) = uug), we may identify each snapshot with some angle 6 6 [0, 2ir). By rotating each 
of these snapshots "back" by the angle 9 with which it has been identified, we obtain a set 
of identical system snapshots (thereby removing one degree of freedom from the evolving 
system) . 

The removal of this degree of freedom allows us to perform certain tasks, such as 
denoising a collection of snapshots through averaging (in Singer et al. ; Singer &: Shkolniskyl . 
a similar procedure is used on cryo-EM data), more easily. In the case where u(9,t) is 
evolving its shape in addition to traveling (for example, when T>(u) = uug + £ (u), where 
£(u) is some other nonlinear spatially invariant operator), removing this "traveling" degree 
of freedom from the simulation can significantly assist our understanding of the dynamics. 
See Figure [T] for an illustration. For instance, when one uses diffusion maps to explore 
whether the simulation data are intr insically low-dimen si onal, and to find go o d "coarse" 
param e trizations for them (see , e.g., lLafon fc Led (|2006T ): ISondav et all fl2009h : lDas et all 



fl2006[ ) 



Coifman et all (l2005fi ); lErban et all fl2007l V)7 removing the symmetry results in 
a more parsimonious description of the dynamics (an embedding in a lower- dimensional 
space), which may also be successfully deduced with far less data. 
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Figure 1: At the top, a representative schematic of a system evolving in space and time. Here, the domain 
is periodic in the spatial direction. Using a triangle shape as a template (see text for a discussion of 
templates), we spatially shift each dynamic snapshot (each time slice of the top figure) to maximally 
correlate with the triangular template. Four such maximal correlations are shown in the middle figure, 
where the shading brings out the difference between the snapshot and the template. At the bottom, we 
see that after alignment, the dynamics of the set of snapshots appears visually much simpler; the traveling 
motion is gone, and all that remains is a slight modulation. 
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Now, suppose we have an ensemble of M snapshots, but we do not know the members 
of the underlying symmetry group with which each snapshot is to be identified. We wish 
to perform this association of snapshots with symmetry group elements; in other words 
we wish to globally "align" the M snapshots. (Here the colloquial expression "alignment" 
comes from the simple conceptual example of rotationally invariant functions on the unit 
circle; the possible rotation angles can be "strung" along a line between and 2tt.) 

Normally, this global alignment (the computation of the symmetry group element iden- 
tified with each snapshot) may be accomplished numerically throu g h the use of a well- 
chosen template function (see Figure [1] and, e.g., Ahuja et al. ( 20071 ); Rowley fc Marsden 



(l2000h ) . For instance, in our running example of snapshots {u(9,tk)}^ =1 , one finds the 
alignments {0k}kLi which align each snapshot with a template T(9) by simply setting 

^ = argmin \\T(9) - u{6 + c, t k ) f. (3) 

c 

The analogue of equation holds for other symmetry groups. This approach will, in 
general, be successful when 

• there is little noise in the data; 

• a "good" template, leading to a clear global minimum, is known ahead of time; and 

• this template remains "good" in the above sense as new data are collected during 
the system evolution. 

When there is noise in the data, or when a good template is not known, "misalignments" 
may happen frequently. Furthermore, as the system evolves, a fixed template may stop 
being "good" (that is, giving rise to a clear global minimum in the above optimization 
problem). 

In this paper we apply a novel spectral algorithm (ISinger to solve this problem 



of global alignment in the presence of symmetry. In contrast to the method of templates, 
which compares snapshots one by one to a fixed "template function" (producing M pieces 
of information), the eigenvector method compares all snapshots to all other snapshots pair- 
wise, in essence treating every snapshot as a template (and thereby exploiting a greater 
amount of information, namely M(M — l)/2 pieces). Even though many of these pair- 
wise comparisons may be inaccurate due to noise inherent in the snapshots, consistency 
relationships among these pairwise alignments can be used to gain a sense of the overall, 
global alignme nt. A slight modification of this algorithm known as vector diffusion maps 



(ISinger fc Wul ) allows for the situation in which the snapshots differ not only by a sym- 
metry group element (and noise), but also because there is a systematic change in the 
snapshots due to the underlying dynamic evolution. Both algorithms are fast, simple, and 
(as we will demonstrate) more robust to noise than their corresponding template-based 
approaches. 

The "eigenvector algorithm" will be illustrated through two prototypical examples. 
The first involves the evolution of orientational distribution functions of nematic liquid 
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crystal polymers; the distributions are functions on the sphere, and we take the associated 
symmetry group to be SO (3). The second involves spatiotemporally traveling/modulating 
waves of the Kuramoto-Sivashinsky equation (KSE); these are functions on the unit circle 
with periodic boundary conditions, and we take the associated symmetry group to be 
50(2). Additionally, for the case of the KSE waves, we demonstrate the use of vector 
diffusion maps to (all in a single step), remove the underlying symmetry and capture the 
low- dimensionality of the underlying dynamics (the residual dynamics of modulation after 
the "traveling" symmetry has been removed). 



3. The eigenvector alignment method 



In its most general form, the eigenvector alignment method (jSingeri (120111 )) can be sum- 
marized as follows. Consider an ensemble of M snapshots which are identical, except for 
the action of some underlying symmetry group G (such as spatially periodic translation) 
and perhaps some noise. We wish to know the group elements {gi}^ =1 G G with which 
the M snapshots may be identified; this will give us information which can be used to, for 
instance, ascertain what "rotation" to perform to make a particular snapshot equivalent 
to another (i.e. to "align" the two snapshots). Specifically, if we identify snapshots i and 
j with group elements ^ and gj, then rotation of snapshot i by = g^g^ x should make 
it identical to snapshot j. In our simple illustration of periodic functions u(8,t) traveling 
around the circle with speed u (dynamics Ut = umo), the symmetry group elements are 
angles modulo 2n (the group is SO{2)). Each snapshot u{6,ti) can, in principle, be iden- 
tified with some angle Q{. Snapshot i may be made equivalent to snapshot j after a (say, 
systematically counter-clockwise) rotation of snapshot % by = 9j — 9 { . 

When the snapshots are noise-free, obtaining the {gi\fL x may be done easily as follows. 
Choose one base snapshot, or "template," say snapshot i. For this snapshot i, choose a 
particular random assignment g^. For each remaining snapshot j, find the gij G G which 
rotates snapshot i to be identical to snapshot j, and then set gj = g^gi- Alignments 
between any two snapshots p and q can then be computed as g pq = g^giq- In the example 
of angles modulo 2tt, this means choosing some base 9i for snapshot i, then setting 6j (for 
each of the remaining snapshots) to be 6j = 6ij + 9^ (where 0y is the angle which rotates 
snapshot i to be identical to snapshot j). Alignments between any two snapshots p and q 
can then be computed as 9 pq = 6 iq — 9 ip (to get from snapshot p to q, rotate snapshot p 
back to snapshot i, then rotate i to q). 

Because the method above relies on using only a single template, it may well not be 
robust to noise; obtaining the gj may not work well because many of the {gij}^Li will be 
computed incorrectly. The eigenvector method instead has the user compute all {gij}ffj=i 
(in essence, treating every snapshot as a template); it then looks for consistency along 
these pairwise alignments to assign the global alignments {gi}fL\- The main idea is as 
follows: if gij, gjk, and g^ are accurately measured, we also expect, for example, that 

gik = gijgjk, (4) 
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a condition known as the triplet consistency relation. In our example of angles modulo 2ir, 
this simply says that, regardless of whether snapshot i or j are used as the template, the 
angle between snapshot i and snapshot k should be the same no matter if it is measured 
directly (Oik) or inferred (Oij + Ojk). Analogously, we also expect "higher-order" consistency 
relations of the form 

9u — 9ij9jk9kb (5) 

Since many of the measurements of may be inaccurate, equations (jlj), (jSJ), and their 
high-order forms will often be violated; however, one can still hope to assign the gi in some 
sort of globally optimally consistent way. 

Initially, one may attempt to assign the gi so that as many pairwise measurements 
gij as possible are satisfied to within some tolerance. Unfortunately, for even a moderate 
number of group elements M, it is computationally intractable to find the assignment of 
the gi which maximizes the number of them which are satisfied (to within some tolerance). 
This is a non-convex optimization problem in a very high dimensional space. As we discuss 
now on the example of angles modulo 2n, a relaxat i on of t he problem to a quadratic (and 



therefore convex) form has been proposed (jSingerl ( 1201 ll )). The only requirement is for 



the symmetry group G to have a compact real/complex form. The relaxation makes the 
optimization problem more tractable, but it also allows for the "solution" $ to include 
elements not necessarily in G (we will explain this and show how it can be rectified below). 

Again, consider the problem of angles modulo 2ir. This group has a compact complex 
representation given by mapping Oi to e l9 \ Measurements of 0y, which are (noisy) mea- 
surements of 0j — Oi, are represented similarly as e t9ij . At first, one might wish to formulate 
the problem so as to assign the global alignments Oi in order to maximize the number of 
pairwise measurements which hold true to within some tolerance tol, for instance 

argmax : —tol < Oj — Oi — 0y (mod 2n) < tol}. (6) 

{6i\ 

This problem becomes quickly computationally intractable for large M, even after a refor- 
mulation to the form 

argmin V / [0 } - 4 - tJ (mod 2n)] , (7) 

<*> (M) 

where / is some smooth periodic penalty function. 

Instead, the problem is relaxed as follows: the measurements Oij are inserted into a 
matrix H so that Hy = e~ ldij . We now consider maximizing the following quantity: 

M 

argmax V e^H^e^. (8) 

{*} Ti 

When the Oi are correctly assigned, each "good" measurement of Hy contributes close to 
1 in the sum and each "bad" measurement contributes, on average, to t he sum (since the 



errors are assumed to be uniformly randomly distributed, see ISingerl ( 1201 if )). Therefore, the 



maximization of the expression (JSJ) is likely to produce, in some sense, maximally consistent 
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assignments of the 6{. To make the problem even more tractable, it is further relaxed to 
a quadratic form (general complex numbers, as opposed to complex numbers on the unit 
circle only) which can be easily solved with power iteration: 



M 



argmax 

{z 1 }ec,J2\z l \2=M 



tj z j- 



(9) 



Maximizing the expression ([9]) amounts to finding the largest eigenvector v of the Hermitian 
matrix H. The components of the largest eigenvector v are not necessarily of unit length, 
but after normalization, one can define the estimated angles by 



vli) 



V{1 



(10) 



It is interesting to note that the error of the assignments Q{ can be estimated by looking 
at the eigenvalue spectrum of H. Consider, for instance, the correlation p between the 
eigenvector v and the vector z of true angles as a measurement of "goodness of fit" ; this is 
given as 

1 



P 



M 

E 



e~ l %(i) 



(11) 



Under certain assumptions about the type of noise in the problem, one can show that 

1/ \i2 ^ -Vff — ^R 



Mp 



(12) 



whe re M is as ab ove, and p is a quantity related to how likely "good" measurements are 



sec 



Singerl (120111 ) for details). Here Xh is the leading eigen vector o f the matrix H; if the 



(rando m) matrix H has a number of properties (again, see ISingerl (120111 ); iFeral fc Peche 
( 120071 )) its eigenvalue distribution will include a semicircle, and the right edge of this 
semicircle will be the quantity Xr. Furthermore, 



A 



R 



2^/M(l-p 2 ) 



and 



E [A 



Mp + 



1-p 2 
p 



(13) 



(14) 



where equation ( 114)) is yalid wheneve r p > 1/yM and the variance in the quantity A# 
increases as p decreases (ISingerl (l201lh : IFeral fe Pechel f|2Q07h ). 

Although the noise model presented in ISingerl ( )201ll ) is different than the noise in our 
problems, equation ( 171]) holds regardless, and we still expect the alignment error to decrease 
as both M and p grow (more data/pairwise comparisons and higher quality measurements, 
respectively, will lea d to a better re c overy of the global alignments). 

We also note that lFeral fc Pechel (l2007h requires the noise in every entry of the matrix H 
to be independent. This is not necessarily true in our examples. It is likely that "good" and 
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"bad" measurements are not random, but rather, correlated; having independent entries 
requires M 2 sources of randomness, and clearly, for large enough M, this will cease to be 
true because the "amount" of randomness scales only as M, the number of snapshots. For 
large M this argument can rationalize why some eigenvalues (with magnitude of O(M)) 
may appear outside the theore tically expected s emicircle (see, e.g., Figures |5] and [15]) . This 
phenomenon is investigated in lCheng fc Singer . 



4. The first illustrative example: orientational distributions of nematic liquid 
crystal polymers 

Symmetry often plays an important role in systems with spontaneous spatiotemporal 
pattern formation; such systems, typically modeled thro ugh partial differential equa tions, 
arise naturally in mo deling reaction - diffus ion and/or flow (ICross fc Hohenberd (Il993r0. but 
also n onlinear optics ( lArecchi et al.l ( 119991 ) ) and Bose-Einstein condensates (IKevrekidis et al 
(120081 ) ). If the computational models are in the form of stochastically interacting parti- 
cles, the finite number of the simulated particles and the stochasticity of their evolution 
naturally gives rise to noise in the recorded time series (and we know that the fewer the 
particles, the "larger" in some sense the noise will be). To illustrate this, and to show how 
to factor out symmetries at the "macroscopic" level while working with a "microscopic," 
particle based, noisy simulation, we chose an illustrative example for which good models 
exist at both the particle- and the continuum levels. The system in question is the evolution 
of the single particle orientational probability distribution function in the case of nematic 
liquid crystals; a closed equation tha t very successfully approximates this evolution is a 
Smoluchowski equation ( jSiettos et al.l ( 120031 )). An alternative description of the dynamics 
comes in the form of coupled stochastic differential equations which model the interactions 
of a (large but) finite number of nematic liquid crystal polymer molecules; one hopes that, 
for a large enough number of simulated interacting particles, the computed evolution of 
their collective orientational probability distribution approximates the trajectories of the 
(mesoscopic) Smoluchowski equation. 

It is well known (and can be seen from the form of equation [TS] below) that the evolu- 
tion of the orientational probability distribution is characterized by equivariance: rotating 
the initial distribution on the unit sphere and evolving commutes with evolving for the 
same amount of time and then rotating the final distribution. This implies that experi- 
ments (or simulations) differing by some (unknown) mesoscopic rotation of the entire initial 
distribution should, in effect, produce the same results (modulo the effects of noise). 

4-1- System setup 

Liquid crystalline polymers (LCPs) are large molecules which contain long rigid seg- 
ments. Groups of LCPs are capable of displaying rich behavior including high modulus 
in the solid phase, low viscosity in the melt, and many other interesting and/or desirable 
physical properties. Each LCP can be thought of as a "needle," whose orientation may be 
described as a pair of antipodal points ±w (the "tips" of the needle) on the unit sphere; 
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as the number of LCPs in a group becomes large, the evolution of the single-particle ori- 
entational probability distribution function i/)(u) of the group is accurately described by 
the Smoluchowski equation 



dip(u) 
~dT 



D 



d_ 
du 



dijj(u) 
du 



+ ^(u) 



d_ fV[ip,u] 
du 



kT 



(15) 



Here, u is a unit vector describing orientation, d/du is the gradient operator restricted 
to the unit sphere, k is Boltzmann's constant, T is the absolute temperature, D is the 
rotational diffusivity (here set to 1), and Vfy, u] is a nematic potential (a free energy 
taking into account excluded volume effec ts) . For our simulations, we use the Maier-Saupe 
potential (see, e.g. iMaier fc Saupel ril959h ) 



-Uuu : S, 



(16) 



where S = (uu) — |l is the tensor order parameter. The parameter U (the intensity 
of the nematic potential) can be thought of as proportional to the concentration of the 
LCP "rods". If A is the eigenvalue of S wit h the largest magn itude, the so-called scalar 
order parameter S is given by S = 3A/2 flSiettos et all (j2003h ). Writing equation ( fl5|) 
as di[)(u)/dt = V(if)(u)), the Smoluchowski equation is equivariant in the sense that 
P(^(Ru)) = RD()|i(u)), where R is a member of 5*0(3). 

Computationally, the evolution of the distribution function can be simulated as a large 
set of coupled stochastic differential equations. One simply represents the distribution ip{ u ) 
as a collection of N representative individual LCPs, and then computes their trajectories 
{±Wj(t)}^ :1 (here, {±w i (t)}^ 1 are vectors on the surface of the sphere, and the "±" is 
because each LCP is really a rod with identical "top" and antidiametric "bottom"). Ini- 
tializing a dist ribution t/yi(u) wi t h N p articles may be done with the Metropolis-Hastings 
algorithm (see iMetropolis et al.l ( 19531 ) or the Appendix), and as N goes to infinity, this 
initialization converges in measure to ipo(u). Using the N particle trajectories, ensemble 
averages (/(u(t))) at any time t may be evaluated as J2iLi /( w «(^)) + /(~ W «W) (where 
here, again, we have a "— " due to the fact that each LCP has a top and a bottom). 
The distribution ipt{ u ) at time t may also be reconstructed by a variety of techniques; 
here we choose to do the reconstruction by evaluating ensemble averages of the form 
^jEh^^'W) + yj m *(— Wj(t)) (these Y™* are the spherical harmonics coefficients 
of T/>t(u), see Section I4.3p . The explicit Euler-Maruyama integration of each individual 
(stochastic) trajectory takes the form 



Wi{t + At) 



Wi(t) 



Wi(*) - 



kT au 



V2DAb 



(17) 



By using different numbers for N, the errors in the initialization of ipo{ u ), the computations 
of the (f(u(t))), and the reconstruction (from the particles) of ^(u) can be controlled, 
since they scale as 1/ vN. 
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The evolution of the Smoluchowski equation is equivariant under the group 50(3); 
rotating a given orientational probability distribution by some element R of SO (3) and 
evolving is the same as evolving first, and then rotating the result by the same group ele- 
ment. In an SDE reformulation of the problem, an orientational probability distribution is 
represented by N particles. For purposes of computational exploration of its evolution, a 
particular ensemble of N particles is equivalent to any other ensemble in which each of the 
iV particles Wj is rotated by the same element of the SO (3) group, w* — > Rw. ; . Further- 
more, due to the randomness of the Metropolis-Hastings algorithm, each initialization of 
ipo{ u ) leads to a different initial ensemble of N particles (which will accurately represent 
^o( u ) as iV goes to infinity, but which represent ipo{u) noisily for finite N). Thus, in the 
limit of infinite N, a particular ensemble of N particles initialized consistently with a par- 
ticular initial probability distribution ipo(u) is equivalent to another ensemble initialized 
consistently with f/> (Ru): the original distribution, but rotated by a member of R of the 
SO (3) group. For finite N, there is noise, and these two ensembles of N particles are only 
approximately the same after rotation by a member of S0(3). Finding this corresponding 
member R of 50(3) becomes increasingly difficult as iV gets smaller. 

Suppose we are given a set of M LCP ensembles, each initialized with N particles, each 
consistently with ^ (Rj u ) f° r some unknown rotation {Rj}^ x G 50(3); and let us evolve 
each of these ensembles for some fixed time T. The result is a set of M ensembles of iV 
particles which should be approximately the same after each is rotated by R^ 1 = Rf G 
50(3) (the difference is due to the finiteness of N). We wish to be able to consistently 
determine the unknown members {Rj}^ so that we know how to relate each ensemble of 
iV particles to each other ensemble. When N is small (equivalently, when the "noise" is 
large), misalignments are bound to occur frequently. Therefore, as before, we expect the 
eigenvector alignment method to outperform a method based simply on a fixed template 
function. 

4-2. Consistent initialization of LCP distributions 

In order to compare the performance of the eigenvector alignment method with that 
of the classic template method, we must first generate appropriate data. For chosen M 
(number of ensembles) and N (number of particles), this can be accomplished by first 
generating M random members {Rj}^ of 50(3), and then initializing M ensembles of 
N particles according to the distributions ipo(Ri u ) via the (random) Metropolis-Hastings 
algorithm. 

This is illustrated through four plots in Figure [2j here, we have plotted both the "top" 
and "bottom" (which are interchangeable) of each of the N LCP particles. For two random 
rotation matrices Ri and R 2 , and for both N = 50 and N = 5000, we show initializations 
with respect to the probability distribution functions ^o(Riu) and V ; o(R2 u )- Here, we 
selected and initial probability distribution which resembles a "P" shape (along with its 
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Figure 2: Initial distributions of four LCP ensembles from two different rotation matrices (top and bottom 
rows, Riu and Rail), and with two different numbers of points (N — 5000 left, N = 50 right). Shown are 
both the "top" and "bottom" (interchangeable) of each LCP particle. 

reflection through the origin). It is given by 

z > and |x| < 0.1 and y < 1/^2 
z < and |x| < 0.1 and y > -l/y/2 
z,y>0 and 0.4 < ^x 2 + y 2 < 0.6 , (18) 
z, y < and 0.4 < ^x 2 + y 2 < 0.6 

else 

where Norm is some normalization so that ipo integrates to 1. We subsequently evolved 
these four ensembles for a fixed amount of time T using the algorithm ( fTTl) . The resulting 
ensembles are shown in Figure [31 

In the numerical experiments to follow, we choose various values of both M and 
N, thereby generating M different ensembles of N particles corresponding to ^o(Riu), 
^o(R-2u), . . ., ip Q (n M u). We then integrated each ensemble for a fixed amount of time T 
using the Euler-Maruyama scheme (see equation ( fTTl) ). obtaining M distributions corre- 
sponding to ^t(Riu), ?/>t(R-2u), . . ., ^t(Rmu). These distributions differ by (a) a rota- 
tion; (b) the particular consistent initialization of the iV particles; and (c) the particular 
(stochastic) particle sample paths computed through the Euler-Maruyama integration. 

Given only the noisy particle distributions obtained at time T, we wish to determine 



' 50/51 
j 50/51 
i> (u=(x,y,z)) = — -I 50/51 

50/51 
k 1/51 
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Figure 3: The final distributions after integration by T — .Is of the LCP ensembles initialized as in Figure 
Shown are both the "top" and "bottom" (interchangeable) of each LCP particle. 
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the M unknown rotation matrices Ri, R2, . . . , Ra/. When N is small, the noise (which 
scales as 1/y/N) makes this particularly challenging. 

4-3. Alignment of LCP distributions 

Pairwise alignment was performed by both the template method (alignment of each 
ensemble member with a fixed template) and by the eigenvector method (alignment of each 
ensemble member with each other ensemble member). In our work, we utilized the spherical 
harmonics components of the orientational distribution functions (computed based on the 
particle states) to perform pairwise alignment of every pair of ensembles of N representative 
particles. Akin to a Fourier basis on the sphere, spherical harmonics take into account 
not only lower-order information such as the center of mass of the distribution (the first 
three nontrivial spherical harmonics), but also its higher-order moments. Additionally, 
the leading spherical harmonics coefficients can be used to quickly compare functions and 
rotated versions of these functions on the sphere (see below), so they are useful for finding 
optimal pairwise alignments (required by the eigenvector alignment method). 

To align two ensembles of N particles, we first approximated computationally the lead- 
ing coefficients of the spherical harmonics expansion of both particle distributions. Let the 
spherical harmonics expansion of the first distribution be (approximately) given by 

'max 1 

/(M)= E E fryrvA)- m 

1=0 m=-l 

Here, fj 71 is computed as an integral over the surface of the sphere fl via 

fr= ! m^Yrie,^-, (20) 

Jn 

by representing the particles as delta functions, equation f )20|) is approximated as 

1 - 

i=l 

where 0$ and fa are the (9, fa) spherical coordinates of the zth particle's orientation vector 
Wj in the distribution (and we include (71 — 9i, fa + n), of course, because each LCP has a top 
and bottom which are interchangeable). It is clear that only the even spherical harmonics 
coefficients survive; for the odd ones, F / m *(6 l j, fa) + YJ m *(7r— 0j, 4>i+tt) equals zero. Similarly, 
second distribution g(9, fa) may be approximately described by its coefficients g™. 

The squared L 2 difference e(/, g) between the two functions can then be approximated 

as 

'max l 

<f^)= E E Wf?-9?\\ 2 - (22) 

1=0 m =-l 

Once the spherical harmonics expansion of a function h(u) is known, the spherical har- 
monics expansion of Jir = /i(Ru) can be computed quickly; therefore, it is only necessary 
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to perform the time-consuming calculations in equations (fl9|) and (|20|) once (these might 



be spe d up by FFT-type fast algorithms which we did not use, see, e.g. iRokhlin fc Tygert 



(120061 ) ). In order to find the rotation matrix R that best aligns two distributions of iV 
particles with respect to L 2 , we may simply compute 



R = argmin e(/, g n ) 

RG50(3) 



(23) 



Our rotations of the spherical harmonics were performed using the freely available software 



archive SHTOOLS available at www.ipgp.fr/~wieczor/SHTOOLS, and we computed the 
best R by exhaustively searching over SO (3) with a mesh of two degrees precision in each 
of the 9, 4> directions. We thus obtained a good initial guesses, for each snapshot, of the 
sought rotations, and subsequently used Newton iteration to more accurately determine 
the optimal R. 

4-4- Template-based alignment attempts 

Using the spherical harmonics machinery, we first attempt to align the set of M en- 
sembles of N particles through the use of fixed templates. Somewhat arbitrarily, we chose 
the three fixed templates shown in Figure HI One of the template functions (Template |— ) 
resembles the orientational distributions of Figures [2] and El we anticipate that at least this 
template will be useful in aligning the data. Nevertheless, the global alignments obtained 
with all three fixed templates fall short of those obtained with the eigenvector method (see 
Tabled]). 

4-5. Application of the eigenvector method 

The first step in aligning the data through the eigenvector method is to compute 
pairwise alignments (methodology discussed in Section 14. 3[) between all M distributions, 
{Rij}*j =1 . Here, R^- is the 3x3 matrix which rotates ensemble j to ensemble i. Next, 
these pairwise rotations are inserted in a large 3M x 3M matrix of the following form: 



M 



R n 
R21 



R12 
R22 



RlAf 
R-2M 



R 



Ml 



R 



M2 



R 



MM 



(24) 



In an ideal setting with no misalignments and no noise, the ij-th block of the matrix 
M would simply be RjRj, for this is the matrix which takes distribution j back to the 
standard axes, and then rotates it by Rj in order for it to coincide with distribution i. We 
also note that in this ideal setting, the following equation holds: 

(25) 



where v is the 3M x 3 matrix 



Mv = Afv, 

Ri 
R 2 



(26) 



R 



M 
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Figure 4: The three templates utilized for alignment of the ensembles of N particles. Because Template 
|— is vaguely reminiscent of the orientational distributions of Figures p and [H it is expected that it may 
do well (the "matched filter" is mathematically optimal, see IPapoulisI (|1977l )). Nevertheless, it drastically 
underperforms the eigenvector method, along with the other two templates (see Table [TJ . Template o . 
consists of two ellipses (one centered at each pole of the unit sphere) with minor and major axes of .2 and 
.4, respectively, along with two points at (il/v 7 ^, ±l/y/2, 0), each with "mass" equal to one-quarter of 
the mass of the entire shape. The red dots in Template ( . represent points with infinite weight and are 
located at (0,0, ±1); the effect is that each snapshot is first rotated so that its center of mass lies along 
the z-axis, and then it is rotated along this axis (an 50(2) rotation) to optimally align with the broken 
semicircle "(" shape. 



Therefore, the top three eigenvectors of M (each with eigenvalue M) contain information 
about the "unknown" rotation matrices Rj. Here, the matrix M is of rank 3, and M = vv T . 
That is, M has only two distinct eigenvalues: an eigenvalue of M whose multiplicity is 3, 
and an eigenvalue of whose multiplicity is 3M — 3. It is therefore expected that the top 
three eigenvectors would not be affected too much by noise and misalignments. 

When there is some noise, the v matrix will still be resolved (but now with eigenvalues 
slightly less than M), and we expect to be able to recover the information contained in 
these columns regarding the rotation matrices Rj. The recovery will be, of course, only 
up to an orthogonal transformation which is an inherent degree of freedom: by specifying 
only pairwise rotations, one only knows how the distributions look relative to each other. 
This transformation (in effect, its three associated degrees of freedom) appears in equation 
( )25l) . for this equation holds not only for v, but also for vR for any R G SO (3). Due to the 
noise, each recovered Rj is not exactly a rotation matrix (this phenomenon is analogous 
to the Zi not being necessarily of unit length in equation (J5J) of Section [3]). However, one 
can find the closest (in Frobenius norm) rotati on matrix via t h e wel l- known proce dure: 
R, -»■ U,V„ r where U^A^ is th e SVD of R; flFan fc Hoffman! fll955f ): iKellerl (Il975j )). 

Because iFeral &: Pechel ( 120071 ) requires the noise in each element of the matrix M to 
be independent, we do not see the expected semicircle- type distribution in our Figure [5j 
notice, however, that a shape reminiscent of a semicircle can still be seen. Again, this is 
due to the fact that "good" and "bad" measurements are not random and independent, 
but rather, correlated; having independent entries requires M 2 sources of randomness, and 
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Figure 5: A histogram for the eigenvalues for M = 250 and N = 10. One eigenvalue is near 250, and there 
are fou r pairs of eigenva lues near 125, 88, 63, and 44 (these are likely due to the correlations in the matrix 



M, see lCheng fc Singerl and Section[3|). The other eigenvalues are "in the semicircle." We need the largest 



three eigenvalues and their eigenvectors to recover the rotations. So, in practice we use the eigenvalue near 
250 and the pair of eigenvalues near 125. 



clearly, for large M, this is not true because the "amount" of randomness grows only as M, 
the number of snapshots. Because of this, f or large M, some eigenvalues of O(M) appear 
outside the semicircle. See ICheng fc Singerl for details. 

The error in the global rotations recovered are shown in Table [TJ The eigenvector 
method appears quite successful: even for large amounts of noise (small N) and small 
values of M, favorable results are obtained. Even though the eigenvalue semicircle analysis 
of Section [3] was carried out for group SO (2) and not the group SO (3) of interest here, the 
distance from the leading eigenvector to the noisy semicircle still quantifies the alignment 
error. Furthermore, as expected, when both M and N both become large (large N means 
that the probability of "good" measurements goes up, and in the context of Section El that 
p —> 1), the leading eigenvalues (A#i, ^m, ^m) increase as O(M) and the position of \r 
increases as 0(a/M). These results are summarized in Tabled] 
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Table 1: Results of the eigenvector method (vs. N and M) and alignments with various templates (vs. N). 
As before, N is the number of LCP particles representing the distribution (smaller N implies more noise). 
M is the number of ensembles of N particles, meaning that we perform M(M — l)/2 comparisons when 
using the eigenvector method. Xhi, Xh2, and A# 3 are the 3 largest eigenvalues of the 3M x 3M matrix M, 
and these are the eigenvalues which contain the rotation matrix information. is the eigenvalue at the 
right edge of the semicircle. Finally, the "err" quantities describe the error in the computed R^, and are 
equal to 1/M Xa=i ||R-itrue — R-iIIf ( err A is for the eigenvector method, and the rest are for the templates 
shown in Figured]). The eigenvector method easily outperforms all templates, and as expected, the error 
grows as M and N get smaller due to less information and more noise, respectively. Xhi appears to scale 
as M, while Xh2 and Xh3 scale with M and decrease with noise. Xr increases with both noise and M (see 
Section [3|). 
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5. The second illustrative example: modulated traveling waves of the one- 
dimensional Kuramoto-Sivashinsky equation 



Symmetries play an important role in systems that exhibit spatiotemporal pattern for- 
mation (and the evolution equations that model them). When processing experimental or 
computational data that arise in observing such problems, it again makes sense to first 
factor out the underlying symmetries. As an example of such a spatiotemporal pattern- 
forming system, we choose the Kuramoto-Sivashinksy equation (KSE) in one spatial di- 
mension and with periodic boundary conditions, which can be written in the following 
form: 

xxxx 

+ a[u xx + uu x ] = 0, 

u(t,x) = u(t,x + 2n). (27) 

This well-known nonlinear PDE arises as a model in many physical cont e xts, from flame 



front p ropagation to the dynamics of falling liquid films (jSivashinskyl ( 119771 ) ; iKuramoto fc Tsuzuki 



(119761 )). It gives rise to a rich variety of spatiotemporal dynamical patterns including steady 
state multiplicity and symmetry-breaking bifurcations, as well as traveling, modulated and 
"turb ulent" wave s . It h as been shown, under certain conditions, to possess inertial mani- 



folds (iJolly et al.l ( 119901 )). implying that its long-term dynamics are low-dimensional; this 
low dimensionality, along with the rich spatiotemporal dynamics, is an important reason 
for selecting it as an illustrative example. 

5.1. System setup 

For certain values of the parameter a, the KSE exhibits attractors that are traveling 
waves that are not of constant shape, but rather exhibit spatiotemporal fluctuations; these 
are termed Modulated Traveling Waves (MTWs). Such attractors can be thought of as 
two-dimensional tori (T 2 ) in infinite-dimensional space; one "direction" around the torus 
corresponds to traveling, and the other to a periodic modulation. We will study transient 
computational data obtained in such a parameter regime; the data do not necessarily lie 
on the MTW attractors, but they are visually close enough that the two types of motion 
are visible in our plots. 

Equation (|27|) is equivariant with respect to spatial translations; therefore, the "travel- 
ing" behavior of these waves may be factored out (the underlying symmetry group is that 
of positions x modulo 2tt, or, as we referred to it before, that of angles modulo 2n - 50(2)). 
Writing equation (I2T1) as u t = T>(u), the equivariance relation becomes V(S c [v}) = S c [D(v)], 
where S' c [f](x) = v(x + c) is the shift operator on spatially periodic functions. 

Although the traveling behavior of the wavy transients can be factored out, their mod- 
ulation cannot. For an exact MTW attractor, where the modulation (as opposed to the 
traveling) is exactly periodic in time, there does exist a continuous, one-to-one map be- 
tween each phase of the temporal modulation and the set of points on the circle; yet this 
does not lead to equivariance. It is the spatial shifts of arbitrary wave profiles (not the 
temporal ones on exactly periodic attractors) that we are interested in. 
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Figure 6: A temporal snapshot (spatial profile at a moment in time) from a PDE solution close to a 
modulated traveling wave attractor for a = 32. 

An additional qualitative computational observation is that variations in the solution 
snapshots associated with the traveling component of the evolution are significantly larger 
than the variations associated with the "modulation" part, which remains after the trav- 
eling has been factored out (as will be described below). Based on this observation, we 
will still use the eigenvector method to align the data, even though in its formulation 
such a modulation is not taken into account (for a formulation which does take this into 
account, see the discussion about vector diffusion maps, Section [7J further below). We 
will compare this to align ments obtained using template-based methods (as was done in 



Rowley fc Marsdenl (120001 ) ). The output of both methods, the list of global alignments for 
each wave snapshot, can then be used to align each wave snapshot so that the traveling 
motion is factored out and we can focus on studying the modulation exclusively (for in- 
stance, through the use of diffusion maps). Figure [6] is a picture of (a transient closely 
approximating) a modulated traveling wave, and Figure [7] shows the temporal evolution of 
the wave shapes on this transient. 

5.2. Generation of snapshot data in the MTW parameter regime 

To generate an ensemble of M transient snapshots in the neighborhood of an MTW 
attractor, we begin by integrating equation ( 12?)) for an extended period of time on an 
evenly spaced grid of 128 mesh points with width 27r/128. Because the MTW behavior 
is an attractor for the system, after this long time, each u(x,t) at a fixed time t can be 
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Figure 7: A sequence of MTW snapshots demonstrating the temporal evolution of the MTW. One can 
clearly see the traveling motion and a slight modulation on top of this motion. 

thought of as accurately approximating a snapshot on the MTW. In Figure [7] we show a 
sequence of such "MTW snapshots" . 

We take snapshots u(x, ti),u(x, t 2 ), ■ ■ ■ , u(x, tu) at M different times ti, t 2 , ■ ■ ■ ,t M . We 
then make these snapshots artificially noisy by adding Gaussian white noise of variance a 2 
to each of them (to each of the 128 mesh points {a^}^®, we add a normal random variable 
of variance a 2 ). Without this noise, traditional single template-based approaches can do a 
very good job of factoring out the traveling motion of the MTW. With this noise, however, 
template-based approaches can fail spectacularly, while the eigenvector alignment method 
may still usefully resolve the global alignments. 

5.3. Alignment of MTW snapshots 

To find the alignment which aligns a (noisy) wave snapshot u(x,ti) with another 
u(x,tj), we simply set a^- equal to the fee {1,2, ...,128} which minimizes 



here Tjt is the periodic shift operator on the 128 mesh points {xi}] =1 28 defined by Tk[u](xi, t) = 
u(xk+i,t). The analogue of equation (128]) is also used to align a (noisy) MTW snapshot 
against a chosen (fixed) template. 

5-4- Template-based alignment attempts 

Although it is best to select a template with some prior knowledge, even relatively 
arbitrary choices (e.g. a "Mexican hat") may give good results. When the wave snapshots 
contain more than a little noise, however, alignment with a template will certainly give 



128 




(28) 



l=i 
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rise to many incorrect answers. Furthermore, even with no noise, poor template choices 
may result in spurious alignments. 

Figured] shows a MTW snapshot with added Gaussian white noise of variance a 2 = 3.5 2 . 
Clearly, this amount of noise will present a problem to alignment efforts: it is difficult 
to even visually perceive the resemblance with the noiseless MTW snapshot (Figure [6]) . 
Nevertheless, we attempt to align this snapshot (as well as others taken from our data set, 
with the same type of noise added) using different templates. The next series of figures 
shows 

• on the right, the alignment of each noisy wave snapshot in our data set with a 
particular single template (obtained by finding the periodic shift which, according to 
equation (|28|) . results in maximum correlation/minimum L 2 distance with the fixed 
template) vs. its correct alignment; ideally, this plot should consist of one straight 
line (after taking into account periodicity) 

• on the left, to demonstrate the degree of robustness of the alignment procedure, a 
plot of the L 2 distance between the template function and "all" periodic shifts of 
a single noisy MTW snapshot randomly chosen from our data set. This function's 
minimum is the alignment chosen for this noisy MTW by the template method (it 
maximizes the correlation/minimizes the L 2 distance), and it is this "best" alignment 
for all the snapshots that is plotted in the figure on the right. 

First, for reference, the alignment of noiseless snapshots with a template (here, the 
template was chosen to be a particular noiseless MTW snapshot) would appear like Figure 
|9j In this figure, as expected, the alignments obtained are nearly perfect (the figure re- 
sembles a straight line with small-in the L 2 norm- "gaps" caused by the modulation, which 
we will not study further here). These favorable results are expected s ince w e are using a 
mathematically motivated template (a "matched filter," see Papoulis f 19771 )) in noiseless 



conditions. In a slightly more realistic setting our snapshots will be noisy (and we still use 
a noiseless MTW snapshot as our template); this result is shown in Figure [TUl Again, the 
alignments obtained are nearly perfect (the small "gaps" also remain), but now there are 
a few errors. Of course, using a noiseless MTW snapshot as our template can be thought 
of as slightly "cheating" ; from our data set of noisy waves close to an MTW attractor, we 
do not know what an exact, noiseless MTW snapshot looks like. 

We tried several other template functions, including the Mexican hat, a cosine function, 
a step function (equivalently, the second Haar wavelet), and a triangle. Voting-based 
approaches were also tried; in these approaches, the results of multiple templates were 
averaged together in a suitable way in order to come up with a consensus. These voting- 
based approaches were also seen to fail; knowing how to average the votes together is 
a problem, and some templates have many local minima. Finally, center of mass- and 
moment-based alignment approaches also appeared to fail; this was not unexpected, since 
aligning based on moments is closely related to template alignmnent. Some of these figures 
are shown below. 
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Figure 9: Alignment of noiseless MTW snapshots with a single noiseless MTW as the template, 
expected, the alignment appears perfect. 
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Figure 10: Alignment of noisy MTW snapshots with a single noiseless MTW as the template. The resulting 
alignment is nearly perfect, but one can see that the robustness of Figure |H] has already started to wane; 
the range of the L 2 distances computed in this figure is much narrower than that of Figure [9j 




Periodic Shift Original Alignment 



Figure 11: Alignment of noisy MTW snapshots with a cosine. Although the 1? distance plot (left) is 
smooth, its range is much narrower than that of Figure |9] leading to poorer alignments. 
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Figure 12: Alignment of noisy MTW snapshots with a step function (the second Haar wavelet). Here, the 
L 2 distance plot exhibits a narrow range of values. 
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Figure 13: Alignment of noisy MTW snapshots with a triangle-shaped template. As in Figures T5.4I and 
15. 4[ the L 2 distance range is narrow and the alignments obtained are poor. 
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The only template to give a visually satisfactory answer was a (in principle, unavailable) 
noiseless MTW snapshot (again, see Figure [TUT) . Since the noiseless MTW template gave 
such good results, one might be tempted to try a noisy MTW from the data set as a 
template (which would not be considered cheating!). However, the performance of such a 
template is spectacularly poor: see Table [2] for summary statistics. 

5. 5. Application of the eigenvector method 

In the presence of so much noise (again, see Figure [H]), it is difficult to imagine aligning 
the noisy wave snapshots without prior knowledge of a good template such as the one 
provided by a noiseless MTW snapshot (Figure [TUI) . However, the eigenvector method 
takes into account information based on all pairwise alignments (in essence, treating each 
wave snapshot as a template, and looking at all MiM — l)/2 comparisons) and it is seen 
to give surprisingly good results. 

First, we compute all pairwise alignments between the M noisy wave snapshots by 
finding the alignment which minimizes the corresponding L 2 norm of their difference; 
clearly, many of these may be computed incorrectly. The alignment which rotates snapshot 
i to snapshot j, denoted Oy, is (for our spatially discretized waveforms) an integer between 
1 and 128, describing how many mesh points forward one must shift snapshot i in order 
for it to maximally correlate with snapshot j. This alignment is then mapped to the unit 
circle via = exp(— 2inaij / 128) , and a matrix T is constructed as follows: 



exp(— 2nran/128) 
exp(— 2i7ra2i/128) 



exp(— 2i7rai2/128) 
exp(— 2i7ra22/128) 



exp(— 2z7raMi/128) exp(— 2z7raM2/128) 



exp(— 2wraiM/128) 
exp(— 2i7ra 2 Af/128) 

exp ( — 2ina M M / 1 28 ) 



(29) 



In an ideal setting with no noise/no misalignments, the ij-th block of the matrix T would 
simply be exp[— 2i7r/128(a J ' — aj)] (where we denote the actual, unknown rotation of snap- 
shot % by Oj); this is the rotation which takes snapshot j back to the "phase" zero, and 
then rotates it by exp(2z7raj/128) in order for it to coincide with snapshot i. We also note 
that, in this ideal setting, the following equation holds: 



with 



Tv = Mv, 

exp(2z7rai/128) 
exp(2ma 2 /128) 

exp(2z7raM/128) 



(30) 



(31) 



The top eigenvector of T (with eigenvalue M) contains, therefore, information about 
the shifts aj (the "alignments"). In this setting, the matrix T is of rank 1, and it satisfies 
T = vv T , so T has two distinct eigenvalues: an eigenvalue of M whose multiplicity is 1 
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and an eigenvalue of whose multiplicity is M — 1. It is therefore expected that the top 
eigenvalue would not be affected too much by noise and misalignments. 

When there is some noise, v will still be approximately resolved (but now with eigen- 
value slightly less than M), and we are able to recover the information contained in this 
eigenvector regarding the alignments aj. The recovery will be, of course, only up to an 
overall global shift, which (since we only specify pairwise relative shifts) is an inherent 
degree of freedom. This can be seen in equation ( 13"Uj) ; this equation holds for not only 
v, but also for exp(z#)v (and, in fact, any constant times v). In fact, due to the noise, 
each recovered a, will not have exactly unit magnitude; yet the aj may be recovered by 
considering both the imaginary and real parts of the ith entry of v. In particular, we set 

128 flm(vi)\ 

x arctan — . (32) 



2tt \Re(v.,, 

The results of the eigenvector method constitute, without a doubt, a significant im- 
provement upon those obtained using the various fixed templates above (see Figure Upland 
Table [2j The eigenvalue distribution can be seen in Fi gure [151 one lar g e eige nvalue clearly 



dominates the rest. However, because the theory in iFeral fc Pechel (120071 ) requires the 
noise in each of the elements of the matrix M to be independent, we do not see the pre- 
dicted semicircle distribution in Figure US] (although a shape reminiscent of the semicircle 
can still be discerned). Again, this is due to the fact that "good" and "bad" measurements 
are not random, but rather, correlated; having independent entries requires M 2 sources 
of randomness, and clearly, for large M, this is not true because there are only 128 x M 
sources (M snapshots and 128 random Gaussian variables for each snapshot). Therefore, 
for large M, som e eigenvalues of magnitude O(M) appear outside the "semicircle"; see 



Cheng fc Singer! for details. 

For even larger amounts of noise and even smaller values of M, good results can still 
be obtained. In fact, the distance from the leading eigenvalue to the "noisy semicircle" 
quantifies the alignment error (see Section[3]). When M is large and the problem is relatively 
noiseless (so that in the context of Sectional p ~ 1), the distance from A# to A# is predicted 
to be large (again, see Section [3]); the position of the leading eigenvalue Xh increases as 
O(M) and the position of Xr increases as O(VM). 

5. 6. Additional denoising procedures 

Before concluding this example, we note that if we initially filter the noisy wave snap- 
shots, we observe better performance for both the eigenvector method and for some of the 
fixed templates. In the Fourier representation of the non-noisy KSE snapshots (convenient 
for spectral numerical discretization, but also known to be the o ptimal pr i ncipa l component 
(PCA) basis for systems with such translational symmetry, see Sirovich f 19871 )) the power 



spectrum is known to decay quickly. Therefore, we obtain an increased signal-to-noise ratio 
by first projecting each noisy wave snapshot onto its Fourier modes with power spectrua 
larger than some fixed threshold; the information about the underlying (non-noisy) MTW 
attractor which is thrown away by filtering these Fourier modes is small compared to the 
noise thrown away by filtering these Fourier modes. 
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Original Alignment 



Figure 14: Noisy MTW snapshot alignments obtained using the eigenvector method. A significant im- 
provement upon single template methods is observed. 
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-50 50 100 150 200 250 



Eigenvalue 

Figure 15: A histogram for the eigenvalues for M — 250 and a 2 = 2.5 2 . One dominant eigenvalue is 
near 180, and th ere are two others near 50 (these are likely due to the correlations in the matrix M, sec 
Cheng fc Singerl and Section [3]). The rest of the eigenvalues appear to belong to the "semicircle." 
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Table 2: Results of the eigenvector method (vs. a 1 and M) and the alignments with various templates (vs. 
a 2 ). For the eigenvector method, shown are the quantities A^ (the largest eigenvalue), A^ (the right edge 



of the semicircle) , p\ = ^= ^ e 



are the quantities p = tj X) e 



and p' x = 



i9i v(i) 



For the template methods, shown 



i exp(2i7rai/128)|, where the 9i are the true alignments and the are 
the alignments predicted by various templates: a cosine, the second moment, a triangle, the Mexican hat, 
the second Haar wavelet, a noiseless MTW snapshot (use of it is "cheating"), and a noisy MTW snapshot 
(not "cheating"). For a fixed amount of noise, Xh appears to increase approximately with M, while Xr 
increases approximately with \f~M. As the amount of noise increases (c 2 ), Xh decreases, Xr increases, 
and, as expected, both of the p\ decrease (this is expected both intuitively and mathematically, see Section 
[3]). Similarly, other template-based correlations p increase with M and decrease with decreasing noise. 
Clearly, the only competitive template is a noiseless snapshot of the MTW itself. 
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pi 

Position 



2 pi 



Figure 16: A sequence of aligned MTW snapshots. In contrast to Figure[71 the traveling has been factored 
out and only the modulation remains. Although the eigenvector method performs well on noisy snapshots 
(see, e.g., Figure [T^|) . we chose to show noise-free MTW snapshots for visualization purposes. 



6. Post-processing the aligned data of the Kuramoto-Sivashinsky equation through 
the use of diffusion maps 

In the example of the KSE wave snapshots (Section ED, we conveniently allowed our- 
selves to ignore the shape modulation superposed to the traveling motion when seeking 
their global alignments. The reason is that this modulation is comparatively small in L 2 
norm, and therefore, it contributes little to the sum in equation f l28|) . We were able to 
recover, with good accuracy, the global alignments of the noisy wave snapshots (see Figure 

ED. 

With the global alignments recovered, we rotate each snapshot so that the traveling 
motion is factored out and only the modulation remains. When there is no noise, the 
aligned sequence of wave snapshots takes the form of Figure [16] (with noise it is too hard 
to visually perceive the modulation, so we do not include such a figure). 

Given the aligned data, we now perform diffusion maps in order to search for "coarse 



Sondav et al. 


(2009 


); 


Das et al. 


(2006 





as m 



Lafon fc Led d2006l) : 



To 

construct an informative low-dimensional embedding for this data set of M (noisy but 
aligned) snapshots, we start with a similarity measure between each pair of snapshots 
u(x,ti), u(x,tj). The similarity measure is a nonnegative quantity Wy = satisfying 
certain additional "admissibility conditions" ( Coif man et al.l ( 2005a| )). Here, we choose the 
Gaussian similarity measure, and construct a matrix W as 



W 



exp 



J2f=i [u(xk, U) - u(x k , tj)]' 



(33) 
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In this equation, e defines a characteristic scale which quantifies the "locality" of the 
neighborhood withi n which Euclidean dis tance can be used as the basis of a meaningful 



similarity measure ( jCoifman et al.l (I2005ar0. A systemat i c app roach to determining appro 



priate e values is discussed in iGrassberger fc Procaccial (119831 ) . Next, we create a matrix 



K which is a row-normalized version of W: 

Finally, we look at the top few eigenvalues and eigenvectors of the matrix K. In MATLAB, 
for instance, this can be done with the command [V, L] = eigs(K,n + 1), where n + 1 is 
the number of top eigenvalues we wish to keep (we typically are only interested in the first 
few). 

This gives a set of real eigenvalues Ao > Ai > ... > A n > with corresponding eigen- 
vectors {ipj}"j =0 - Since K is stochastic, A = 1 and ipQ = [1 1 ... 1] T . The n-dimensional 

representation of the z-th snapshot u(x, tj) is given by the diffusion map 
where 

a mapping which is only defined on the M recorded snapshots. Here, t represents the "dif- 
fusion time"; to keep things simple, we choose t — 1. In other words, snapshot i is mapped 
to a vector whose first component is the ith component of the first nontrivial eigenvector, 
whose second component is the ith component of the second nontrivial eigenvector, etc. 
If a gap in the eigenvalue spectrum is observed between eigenvalues A n and A n +i , then 



y„, m a y provide a useful lo w-dimensional representation of the data set ( iBelkin fc Niyogi 
(l2003h : lNadler et all tod )). 



When we apply diffusion maps to the (aligned but noisy) wave snapshot data, our 
eigenvalues are 1.00, 0.90, 0.87, 0.62, 0.43, . . . Clearly, there is a gap between 0.87 and 
0.62. Therefore, we expect the first two nontrivial eigenvectors to give a parametrization 
of the residual, "symmetry-adjusted" dynamics corresponding to the modulation. These 
two eigenvectors are shown in Figure [TTJ There is a continuous one-to-one map between 
each possible modulation phase and the set of points on the unit circle, since the data lie 
very close to the attracting modulated traveling wave, for which the modulation is exactly 
periodic in time. We thus expect the first two nontrivial eigenvectors to trace out some sort 
of circle or "loop"; the eigenfunctions of simple diffusion on a closed curve are sin(2s7r/L) 
and cos(2s7r/L), where s is some arclength parameter. The eigenvectors shown in Figure 
[T7]do not trace out an exact circle, but the plot is reminiscent of that shape. In fact, by 
looking at the quantity 



to 



^ = arctan ( -^y ) , (35) 



we can assign a number Tj G [0, 2tt) to each snapshot, parameterizing the modulation. 
When we plot r against a known parametrization of the modulation, we obtain Figure 
IT8l As the two quantities are approximately one-to-one (modulo 2tc), it is clear that our 
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Figure 17: The two diffusion map coordinates (the first two nontrivial eigenvectors of K) obtained from 
aligned, but noisy, MTW snapshots (right) and aligned, noise- free MTW snapshots (left). 



diffusion map analysis has been successful in parameterizing the modulation, the residual 
dynamics of the symmetry-adjusted snapshots. Given the small size of the modulations 
compared to the overall noise of the problem, this is encouraging. 

7. Vector diffusion maps 

In the preceding sections, we were able to take advantage of the eigenvector alignment 
method to provide information about the global alignment of ensembles of snapshots in 
two illustrative pattern-forming systems with symmetry. In the case of the orientational 
probability distributions of nematic liquid crystals (Section HJ), all M snapshots were in 
principle rotated versions of the same distribution function; due to the finiteness of the 
representation, however (each was a collection of iV representative particles), noise be- 
came a feature of the problem. For the spatiotemporally varying wave snapshots of the 
Kuramoto-Sivashinksy equation (Section EJ, we were able to apply the eigenvector method 
to factor out the traveling component of the variation, even though each snapshot was not 
exactly the same up to rotation. We were successful because the modulation (superposed 
to the traveling component of the evolution) was relatively small. We then applied diffusion 
maps to the aligned snapshots and successfully recovered a meaningful, low- dimensional 
representation of the residual dynamics (the modulation). 

Now, suppose that in the case of the LCP orientational probability distributions, the 
set of M snapshots contained not only rotated realizations of the same (noisy) distribution, 
but also randomly rotated versions of snapshots that had evolved for different lengths of 
time. The differences in the M snapshots would then be due to 

• different finite particle realizations of the distribution function (only in the case 
N — > oo do they become the same); 
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Figure 18: Parametrizations of the modulations via diffusion maps. On the x axes, tq and Tpj ("clean" and 
" noisy," respectively) , are computed from the diffusion map eigenvector information in Figure 1171 On the 
y axes, Tjc nown is the "correct" parametrization. The plot of Tj( nown vs. tq is shown just for comparison, 
for tc and TKnown differ by only a phase offset. The two figures are roughly one-to-one (modulo 2tt). 



• different rotations of these distribution function realizations; and 

• the fact that the distribution function changes with time. 

In such a situation, it might not be prudent to try to align two orientational probability 
distribution functions of vastly different shapes (these may arise from evolution over appre- 
ciably different lengths of time). A similar situation might arise if the modulation in our 
traveling/modulating wave snapshots is not small: pairwise ali gnments of vas tly different 
shapes would stop being meaningful. Vector Diffusion Maps (ISinger fc Wuf ) provide an 
approach that, in such circumstances, both help obtain global alignments and also reveal 
the underlying "symmetry-adjusted" reduced dynamics all in one step. 



7.1. A brief introduction to vector diffusion maps 

The reduced descriptions of the dynamics obtained by diffusion maps (as we did in the 
KSE example above) rely on the user's ability to provide a pairwise similarity measure 
Wy between snapshots i and j. From there, the largest eigenvalues (and corresponding 
eigenvectors) of a matrix K are computed, where 
KSE wave snapshots, we set 



In the case of the 



W, 



exp 



El=i [u{x k ,U) -u(x k ,tj)Y 



(36) 



(see Section [6]). Intuitively, the eigenvectors of K corresponding to the largest eigenvalues 
are those related to the mo st robust diffusions in a graph whose vertices are the data (see, 
e.g. iBelkin fc Niyogil ( 120031 )); if snapshot i is "close" to snapshot j in diffusion map space, 
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then it should be possible to transition from the one to the other easily through mutually 
neighboring snapshots k, neighbors of neighbors, etc. 

Likewise, the global alignments provided by the eigenvector method rely on the user 
to first compare all snapshots in a pairwise fashion so as to obtain the group element 
Qij G G which "best" aligns them, and then incorporate the real/complex representation 
of this group element, Oy, into the ij-th block of a matrix. In the case of the KSE wave 
snapshots, we denoted this group element Oy as 

Tij = exp(-2i7ra li /128) (37) 

(see Section [3]). Intuitively, the eigenvector of O with largest corresponding eigenvalue 
corresponds to the most consistent global alignment; if snapshot i can be rotated to snap- 
shot j via g^, then snapshot i should also be able to be rotated to snapshot j through a 
snapshot k (via gikQkj)- 

Vector diffusion maps attempts, in a sense, to combine the two methods (the eigenvec- 
tor method and diffusion maps). To use vector diffusion maps, one first optimally aligns 
two snapshots i and j to obtain #y and thus Oy; one then computes the similarity of i 
and j after this alignment has taken place to obtain Wjj (and, after normalization, Ky). 
A matrix S is then formed whose ij-th block is simply Sy = KyOy. The eigenvectors 
of S corresponding to its largest eigenvalues are computed, and these eigenvectors provide 
information about both symmetry adjustment ("alignment") and about dynamic similar- 
ity. Distances between snapshots in this new vector diffusion map space ar e called vector 



diffusion distances (see equations (4.2) and (4.6) on p. 11 of ISinger fc Wul ). As we noted 
above, alignment comparisons between snapshots should only be trusted when W y - is not 
small, for it may not make sense to compare two snapshots which differ appreciably (e.g. 
in shape and/or in temporal evolution time ). Vector diffusion maps accomplishes this by 
effectively ignoring comparisons Oy for snapshots which are "far away" (small Wy) from 
each other. 



7.2. Application of vector diffusion maps to the spatiotemporal wave snapshots of the KSE 
To apply vector diffusion maps to the KSE example, we form the matrix S by setting 

Sy = TyKy, (38) 

where the the Ty- are obtained by optimally aligning each pair of noisy wave snapshots, 
and the Ky are then computed on the symmetry-adjusted wave snapshots (these Ky are, 
as before, W ti /^ W ik ). 

The top eigenvectors of S are then computed, and the eigenvalues are exactly as in 
Section[6j 1.00, 0.90, 0.87, 0.62, 0.43, . . . This is not surprising, for this particular problem 
actually "decouples"; the modulation is independent of the traveling motion for an exact 
modulated traveling wave (in other, more general problems, this is unlikely to be the 
case). The first eigenvector v , the one corresponding to eigenvalue 1.00, reveals the global 
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alignments (see Section 15. 5p and has the form 



v 



exp(2Mrai/128) 
exp(2z7ra 2 /128) 

exp(2i7raAf /128) 



(39) 



The next two eigenvectors reveal the diffusion map parametrization of the underlying, 
symmetry- adjusted dynamics (the modulation, see Section [6]). These eigenvectors are 
"corrupted" because they also contain the global alignments: 



Vl 2 



exp(2z7rai/128)V>i;2 
exp(2i7ra 2 /128)V>i : 



'(i) i 



'(2) 



exp {2maM / 128)^1 



\M) 



(40) 



However, one can easily get v x and v 2 back to their more meaningful, "original" forms ip\ 
and ip2 of Section O by simply dividing each of them entrywise by v. 

Since this particular example "decouples," the global alignments and eigenvectors ob- 
tained this way agree with those already computed and shown in Figures [TU and [171 we 
do not plot them again here. 



8. Summary and conclusions 



In this paper we applied both the "eigenvect or method" (jSingerl (120111 ) ; ISinger et al 



Singer fc Shkolniskyl ) . and vector diffusion maps ( Singer fc Wu ) (based on the eigenvector 
method) to adjust data ensembles (consisting of snapshots from two evolving systems) with 
respect to the system intrinsic symmetries. We demonstrated the ability of both vector 
diffusion maps and the eigenvector method to align (and in a sense, denoise) the data sets, 
and also parameterize their symmetry-adjusted dynamics. For both examples, the eigen- 
vector method provided a global alignment of the noisy snapshots of the evolving systems, 
even with a small signal-to-noise ratio. Additionally, for the case of traveling and modulat- 
ing waves, vector diffusion maps were shown to both remove the underlying symmetry and 
capture the underlying long-term dynamics (the residual dynamics of modulation, after 
the "traveling" symmetry has been removed). 

The two techniques are f ast and easy to imp l emen t, and as discussed, they are a natural 
analogue to diffusion maps (ICoifman fc Lafonl (120061 ) ) in the sense that they rely on pair- 



wise comparison data. This information is incorporated into an eigenvalue problem, whose 
result is a globally consistent (in a certain sense, see Section [3]) parametrization/alignment 
of the underlying data set. Just as diffusion maps are robust to noise in the computation 
of the pairwise similarity measurements, vector diffusion maps and the eigenvector method 
are robust to both noise and alignment error in the computation of both the pairwise 
similarity measurements and symmetry group members. 
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By taking into account the equivariance of the system dynamics with respect to the 
underlying symmetry, vector diffusion maps may reduce the amount of data required in 
order to successfully elucidate an effective, low-dimensional description of the dynamics. 
Despite the success of nonlinear dimensionality reducti on technique s in finding meaning- 



ful reduced descriptions for complex systems (see, e.g. lErban et al.l (120071 ); ISonday et al. 
( 120091 ); iDas et al.l ( 120061 )). they still suffer from the curse of dimensionality; in general, the 
amount of data required to successfully recover d "intrinsic" dimensions grows exponen- 
tially with d. Factoring out dimensions associated with the symmetry degrees of freedom 
will partially alleviate of this problem. While diffusion maps treats the snapshots as living 
on a manifold M., vector diffusion maps in effect treats the snapshots as if they live in 
the quotient space M./G. This implicit reduction of dimensionality allows the methods 
presented in this paper to provide an improved organization of the data. 
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Appendix: initialization of a probability distribution with the Metropolis- 
Hastings algorithm 

To initialize N particles {wj j^ 1 on the unit spher e cons istently with some ip(u), we use 



the Metropolis-Hastings algorithm ([Metropolis et al.l ( 119531 )). This algorithm may be used 



to design a Markov chain with stationary distribution equal to the desired ip(u). After 
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an initial "relaxation" period of a few iterations, consecutive states w fc of the chain are 
statistically equivalent to samples drawn from ip(u). 

An auxiliary distribution g(«|u), for example, a multivariate normal distribution with 
some mean vector and covariance matrix, is first selected. This q distribution is used to 
generate, from the current state w^, a potential next state w candi . q may be tuned carefully 
to reduce the variance in the empirically observed stationary distribution of the Markov 
chain; for our purposes, we choose to keep things simple and use q — 1, meaning that at 
each step, we randomly generate a point w cand on the unit sphere with no regard to the 
point w k from which it originated. A candidate state w cand generated by the auxiliary 
distribution is accepted with probability 



If the candidate w cand is accepted, the next state becomes w fc+1 = w cand , otherwise if 
w cand is rejected, the next state remains the same as the current state Wfc +1 = w^. After 
running the Metropolis-Hastings algorithm for a large number of iterations, we subsample 
the Markov chain to reduce it to N particles {w i }^ 1 on the unit sphere. These N particles 
become our consistent initialization according to ip(u). 
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